AIbase

# English Visual Interaction

Qwen2 VL 2B Instruct GGUF
Apache-2.0
Qwen2-VL-2B-Instruct is a 2B-parameter multimodal vision-language model, built on the Qwen2 architecture, that supports image-text generation tasks.
Image-to-Text English
second-state
125
3
Florence 2 VLM Doc VQA
A Visual Question Answering (VQA) model fine-tuned from microsoft/Florence-2-base-ft that interprets image content and answers questions about it.
Text-to-Image Transformers English
prithivMLmods
69
4
© 2025 AIbase